BACKGROUND

The present invention relates to multi-dimensional integrated circuits. 

Traditionally, integrated circuit (IC) devices such as custom, semi-custom, or
application specific integrated circuit (ASIC) devices have been used in electronic
products to reduce cost, enhance performance or meet space constraints. However, the
design and fabrication of custom or semi-custom ICs can be time consuming and
expensive. The customization involves a lengthy design cycle during the product
definition phase and high Non-Recurring Engineering (NRE) costs during the manufacturing
phase. Further, should errors exist in the custom or semi-custom ICs, the
design/fabrication cycle has to be repeated, further aggravating the time to market and
engineering cost. As a result, ASICs serve only specific applications and are custom built
for high volume and low cost applications.

Another type of semi-custom device, called a Gate Array, customizes modular
blocks at a reduced NRE cost by synthesizing the design using a software model similar
to the ASIC. The lack of silicon-level design verification results in multiple spins and
lengthy design iterations.

In recent years there has been a move away from custom or semi-custom ICs
towards field programmable components whose function is determined not when the
integrated circuit is fabricated, but by an end user "in the field" prior to use. Off the
shelf, generic Programmable Logic Device (PLD) or Field Programmable Gate Array
(FPGA) products greatly simplify the design cycle. These products offer user-friendly
software to fit custom logic into the device through programmability, and the capability
to tweak designs to optimize silicon performance. The flexibility of this
programmability is expensive in terms of silicon real estate, but reduces design cycle and
upfront NRE cost to the designer.

FPGAs offer the advantages of low non-recurring engineering costs, fast
turnaround (designs can be placed and routed on an FPGA in typically a few minutes),
and low risk since designs can be easily amended late in the product design cycle. It is
only for high volume production runs that there is a cost benefit in using the more
traditional approaches. However, the conversion from an FPGA implementation to an
ASIC implementation typically requires a complete redesign. Such redesign is
undesirable in that the FPGA design effort is wasted.

Compared to a PLD or FPGA, an ASIC has hard-wired logic connections,
identified during the chip design phase, and needs no configuration memory cells. This is
a large chip area and cost saving for the ASIC. Smaller ASIC die sizes lead to better
performance. A full custom ASIC also has customized logic functions which take fewer
gates compared to PLD and FPGA configurations of the same functions. Thus, an
ASIC is significantly smaller, faster, cheaper and more reliable than an equivalent
gate-count PLD or FPGA. The trade-off is between time-to-market (PLD and FPGA
advantage) versus low cost and better reliability (ASIC advantage).

There is no convenient migration path from a PLD or FPGA used as a design
verification and prototyping vehicle to the lower die size ASIC. All of the SRAM or
anti-fuse configuration bits and programming circuitry have no value to the ASIC.
Programmable module removal from the PLD or FPGA and the ensuing layout and
design customization are time consuming, with severe timing variations from the original
design.

SUMMARY

In one aspect, a three-dimensional semiconductor device with two selectable 
manufacturing configurations includes a first module layer having a plurality of circuit 
blocks; and a second module layer formed substantially above the first module layer, 
wherein in a first selectable configuration a plurality of memory circuits are formed to
store instructions to control a portion of the circuit blocks, and wherein in a second 
selectable configuration a predetermined conductive pattern is formed in lieu of the 
memory circuit to control substantially the same portion of the circuit blocks. 

Implementations of the above aspect may include one or more of the following.
A third module layer can be formed substantially above the first module layer, wherein
interconnect and routing signals are formed to connect the circuit modules within the first 
and second module layers. The second module layer in its first configuration can contain 
isolated through connections to connect the first module layer to the third module layer. 
A third module layer can be formed between the first and second module layers, wherein
interconnect and routing signals are formed to connect the circuit modules within the first
and second module layers. The first selectable configuration forms a programmable logic 
device (PLD) with one or more digital circuits formed on the first module layer; one or 
more programmable logic blocks formed on the first module layer and electrically 
coupled to the digital circuits; one or more memory blocks formed on the first module
layer and electrically coupled to the digital circuits; one or more configurable memory
elements formed on the second module layer and electrically coupled to the 
programmable logic blocks to customize the programmable content of the PLD; and one 
or more interconnect and routing signals formed in a third module layer, electrically
coupled to first and second module layers to provide the functionality of the PLD. The
second selectable configuration forms an Application Specific Integrated Circuit (ASIC) 
with one or more digital circuits formed on the first module layer; one or more 
programmable logic blocks formed on the first module layer and electrically coupled to 
the digital circuits; one or more memory blocks formed on the first module layer and
electrically coupled to digital circuits; one or more predetermined connections formed on 
the second module layer and electrically coupled to the programmable logic blocks to 
customize the programmable content; and one or more interconnect and routing signals 
formed in a third module layer and electrically coupled to first and second module layers. 

The second module layer can be generic and user configurable to program and
re-program to alter the functional response and performance of the PLD. The predetermined
conductive pattern can be positioned substantially above the digital circuits. The 
predetermined conductive pattern can also be integrated in the first module layer or,
alternatively, can be integrated in the third module layer. For every given memory pattern
of the second module layer in the first configuration, a unique predetermined connection
pattern can exist in the second configuration to substantially match logic customization. 
One or more of the circuit blocks within the first module layer can maintain substantially 
identical timing characteristics under both configurations of second module layer logic 
control. The memory circuit can include one or more thin film devices such as thin film
transistors (TFTs), resistors and capacitors. The replaceable memory can be selected
from the group consisting of fuse links, antifuse capacitors, SRAM cells, DRAM cells,
metal optional links, EPROM cells, EEPROM cells, Flash cells, and ferro-electric
elements. The digital circuit can include a third-party IP core. The digital circuit
includes a processor capable of executing software logic instructions and other
programmable logic blocks, wherein the programmable logic block is selected from one
or more of a pass gate logic, multiplexer logic, truth table logic, or an AND/OR logic.
The first module layer can include a substrate layer, n-well & p-well layers, field isolation
regions, NMOS & PMOS gate, drain, source regions of transistors built on substrate, N+
& P+ diodes, resistors and capacitors built on substrate, gate oxide, gate poly, salicided
regions, inter layer dielectric and contacts.

In another aspect, a programmable logic device includes one or more digital
circuits formed on a substrate; and a non-planar circuit electrically coupled to the digital
circuits, the non-planar circuit being either a memory constructed to store data to define
the logic outputs of the digital circuits to fabricate a field programmable gate array
(FPGA) or a conductive pattern constructed to define the logic outputs of the digital
circuits to fabricate an application specific integrated circuit (ASIC), wherein the memory
and the conductive pattern options have substantially matching functionality and timing
characteristics.

Implementations of the above aspects may include one or more of the following.
The IC product is re-programmable in its initial stage with turnkey conversion to an
ASIC. The IC has the end ASIC cost structure and FPGA re-programmability. The IC
product offering occurs in two phases: the first stage is a generic FPGA that has
re-programmability containing a programmable module, and the second stage is an ASIC
with the entire programmable module replaced by 1 to 2 customized hard-wire masks.
Advantages of the IC may include one or more of the following. A series of product
families can be provided with a modularized programmable element in an FPGA version
followed by a turnkey custom ASIC with the same base die with 1-2 custom masks. The
vertically integrated programmable module does not consume valuable silicon real estate
of a base die. Furthermore, the design and layout of these product families adhere to a
removable module concept, ensuring the functionality and timing of the product in its
FPGA and ASIC canonicals. These IC products can replace existing PLD and FPGA
products and compete with existing Gate Arrays and ASICs in cost and performance.

An easy turnkey customization of an ASIC from an original smaller PLD or
FPGA would greatly improve time to market, performance, cost and reliability.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 shows a cross sectional view of a first embodiment of an integrated circuit.

Fig. 2 shows a cross sectional view of a second embodiment of an integrated circuit.

Fig. 3 shows a cross sectional view of a third embodiment of an integrated circuit.

Fig. 4 shows a cross sectional view of a fourth embodiment of an integrated circuit.

Fig. 5 shows an exemplary AND-OR PLD Architecture.

Fig. 6 shows an exemplary AND-OR array gate realization of PLD.

Fig. 7 shows one EEPROM implementation of a P-Term logic array.

Fig. 8 shows P-term configuration for SRAM/hard-wired PLD architecture.

Fig. 9 shows an exemplary pass-gate logic.

Fig. 10 shows an exemplary 4-Input logic MUX.

Fig. 11 shows an exemplary 2-Input Truth Table.

Fig. 12 shows a logic tree implementation of a 4-Input Truth Table.

Fig. 13 shows an exemplary 6T SRAM.

Fig. 14 shows pass gate transistor logic controlled by SRAM.

Fig. 15 shows one embodiment of a 5x6 switch matrix.

Fig. 16 shows pass gate controlled by Vcc (power) or Vss (ground).

Fig. 17 shows the 5x6 switch matrix.

DESCRIPTION

In the following detailed description of the invention, reference is made to the 
accompanying drawings which form a part hereof, and in which is shown, by way of 
illustration, specific embodiments in which the invention may be practiced. These
embodiments are described in sufficient detail to enable those skilled in the art to practice
the invention. Other embodiments may be utilized and structural, logical, and electrical 
changes may be made without departing from the scope of the present invention. 

The terms wafer and substrate used in the following description include any 
structure having an exposed surface with which to form the integrated circuit (IC)
structure of the invention. The term substrate is understood to include semiconductor
wafers. The term substrate is also used to refer to semiconductor structures during 
processing, and may include other layers that have been fabricated thereupon. Both wafer 
and substrate include doped and undoped semiconductors, epitaxial semiconductor layers 
supported by a base semiconductor or insulator, SOI material as well as other
semiconductor structures well known to one skilled in the art. The term conductor is
understood to include semiconductors, and the term insulator is defined to include any 
material that is less electrically conductive than the materials referred to as conductors. 
The following detailed description is, therefore, not to be taken in a limiting sense. 

The term module layer includes a structure that is fabricated using a series of
predetermined process steps. The boundary of the structure is defined by a first step, one
or more intermediate steps, and a final step. The resulting structure is formed on a
substrate.

The term configuration circuit includes one or more configurable elements and
connections that can be programmed for controlling one or more circuit blocks in
accordance with a predetermined user-desired functionality. In one embodiment, the
configuration circuits include a plurality of memory circuits to store instructions to
configure an FPGA. In another embodiment, the configuration circuits include a first
selectable configuration where a plurality of memory circuits is formed to store
instructions to control one or more circuit blocks. The configuration circuits include a
second selectable configuration with a predetermined conductive pattern formed in lieu
of the memory circuit to control substantially the same circuit blocks. The memory
circuit includes elements such as a diode, transistor, resistor, capacitor, or metal link, among
others. The memory circuit also includes thin film elements. In yet another
embodiment, the configuration circuits include a predetermined conductive pattern, via,
resistor, capacitor or other suitable circuits formed in lieu of the memory circuit to
control substantially the same circuit blocks.

The term "horizontal" as used in this
application is defined as a plane parallel to the conventional plane or surface of a wafer or
substrate, regardless of the orientation of the wafer or substrate. The term "vertical" refers
to a direction perpendicular to the horizontal direction as defined above. Prepositions,
such as "on", "side", "higher", "lower", "over" and "under" are defined with respect to the
conventional plane or surface being on the top surface of the wafer or substrate,
regardless of the orientation of the wafer or substrate.

Fig. 1 shows a cross sectional view of a first embodiment of an integrated circuit 
that can be selectably fabricated as either an FPGA or an ASIC. In this embodiment, a 
three-dimensional semiconductor device 100 is shown. The device 100 includes a first
module layer 102 having a plurality of circuit blocks 104 embedded therein. The device
100 also includes a second module layer 106 formed substantially above the first module 
layer 102. One or more configuration circuits 108 are formed to store instructions to 
control a portion of the circuit blocks 104. In the first selectable option, circuits 108 are 
programmable to build FPGA products. In the second selectable option, circuits 108 are
wire connections to build ASIC products. In the embodiment of Fig. 1, wiring/routing 
circuits 112 are formed on a third layer 110 above the second layer 106. Circuits 112 
connect to both circuits 104 and 108 to complete the functionality of the PLD. 

Fig. 2 shows a cross sectional view of a second embodiment of an integrated
circuit that can be selectably fabricated as either an FPGA or an ASIC. In this
embodiment, a three-dimensional semiconductor device 120 is shown. The device 120 
includes a first module layer 122 having a plurality of circuit blocks 124 embedded 
therein. The device 120 also includes a second module layer 126 formed substantially 
above the first module layer 122 that includes wiring and/or routing circuitry 128, and a
third module layer 130 formed substantially above the second module layer 126 that
includes configuration circuits 132. In the first selectable option, circuits 132 are 
programmable to build FPGA products. In the second selectable option, circuits 132 are 
wire connections to build ASIC products. The wiring/routing circuitry 128 is electrically 
connected to the circuit blocks 124 and to configuration circuits 132 in the third module
layer 130. The configuration circuits 132 store instructions to control a portion of the
circuit blocks 124. 

Fig. 3 shows a third embodiment which is substantially similar to the embodiment 
of Fig. 2. In the embodiment of Fig. 3, a fourth layer 140 having wiring/routing circuitry
142 is positioned above the third layer 130. The wiring/routing circuitry 142 is electrically
connected to one of the following: one or more circuit blocks 124, one or more 
wiring/routing circuitry 128, and one or more configuration circuits 132. 

Fig. 4 shows one implementation where the configuration memory element is 
SRAM. First, silicon transistors 150 are deposited on a substrate. A module layer of
removable SRAM memory cells 152 is positioned above the silicon transistors 150, and
a module layer of interconnect wiring or routing circuit 154 is formed above the 
removable memory cells 152. In the first selectable option, SRAM cells 152 are 
programmable to build FPGA products. In the second selectable option, cells 152 are 
replaced with wire connections to build ASIC products. To allow this replacement, the 
design adheres to a hierarchical layout structure. As shown in Fig. 4, the SRAM cell 
module is sandwiched between the single crystal device layers below and the metal layers
above, electrically connecting to both. It also provides through connections "A" for the
lower device layers to upper metal layers. The SRAM module contains no switching
electrical signal routing inside the module. All such routing is in the layers above and
below. Most of the programmable element configuration signals run inside the module.
Upper layer connections "C" to the SRAM module are minimized to Power, Ground and
high drive data wires. Connections "B" between the SRAM module and the single crystal
module only contain logic level signals and are replaced later by Vcc and Vss wires to build
the ASIC. Most of the replaceable programmable elements and their configuration wiring are
in the "replaceable module" while all the devices and end ASIC wiring are outside the
"replaceable module". In other embodiments, the replaceable module could exist between
two metal layers or as the topmost layer satisfying the same device and routing
constraints.

Fabrication of the IC also follows a modularized device formation. Formation of 
transistors 150 and routing 154 is by utilizing a standard logic process flow used in the 
ASIC fabrication. Extra processing steps used for memory element 152 formation are
inserted into the logic flow after circuit layer 150 is constructed. A full disclosure of the 
vertical integration of the TFT module using extra masks and extra processing is in the 
co-pending incorporated by reference applications discussed above. 

During the customization, the base die and the data in those remaining mask
layers do not change, making the logistics associated with chip manufacture simple. In
one embodiment, the custom wire connections can be combined with the contact in
module-1 and metal-1 in module-2 processing. In another embodiment, the custom wire
connections can be an extra metal-1, via-1 insertion compatible with logic processing.
Removal of the SRAM module provides a low cost standard logic process for the final
ASIC construction with the added benefit of a smaller die size. The design timing is
unaffected by this migration as lateral metal routing and silicon transistors are untouched. 
Software verification and the original FPGA design methodology provide a guaranteed 
final ASIC solution to the user. A full disclosure of the ASIC migration from the original 
FPGA is provided in the body of this discussion. 

In Fig. 4, the third module layer is formed substantially above the first and second
module layers, wherein interconnect and routing signals are formed to connect the circuit
modules within the first and second module layers. Alternatively, the third module layer
can be formed substantially below the first and second module layers with the interconnect
and routing signals formed to connect the circuit modules within the first and second
module layers. Alternatively, the third and fourth module layers can be positioned above and
below the second module layer respectively, wherein the third and fourth module layers
provide interconnect and routing signals to connect the circuit modules within the first
and second module layers.

In yet another embodiment of a programmable multi-dimensional semiconductor 
device, a first module layer is fabricated having a plurality of circuit blocks formed on a 
first plane. The programmable multi-dimensional semiconductor device also includes a 
second module layer formed on a second plane. A plurality of configuration circuits are
then formed to store instructions to control a portion of the circuit modules.

In another embodiment, a programmable logic device includes one or more digital 
circuits formed on a substrate; and a non-planar circuit electrically coupled to the digital 
circuits, the non-planar circuit being either a memory constructed to store data to define 
the logic outputs of the digital circuits to fabricate a field programmable gate array
(FPGA) or a conductive pattern constructed to define the logic outputs of the digital
circuits to fabricate an application specific integrated circuit (ASIC), wherein the memory
and the conductive pattern options have substantially matching functionality and timing
characteristics. 

The design/conversion of the FPGA to the ASIC is explained next. The larger
and very complex FPGA designs are done with computer-aided design (CAD) tools. A
design specification is converted to a logical entry format for a Design Entry CAD tool.
The abstract logic functions are described using a Hardware Description Language (HDL,
VHDL) or Schematic Diagrams. The design entry is compiled to extract the netlist. This
netlist is used to synthesize the logic to be placed in the FPGA. Design capture so far is
independent of the FPGA platform. A customized Place and Route (fitter) software tool is
used to select the logic gates and to make the required connections in a chosen FPGA.
The design placed and routed inside the FPGA is simulated using test vectors to verify
the performance and functionality. The optimized design database specifies how the
FPGA programmable resources are utilized to achieve the original design objectives.

From the information contained in the design database, a configuration bitstream 
is generated by a tool commonly referred to as a bitstream compiler. All the logic and 
routing customization specific to the design is contained in this bitstream, which is a
binary representation of every single configuration device in the FPGA. This is also
referred to as a bitmap when the bitstream is mapped to the image of configuration
elements. At the physical level, the defining binary data in the bitstream represent the
ON/OFF states of the configurable switches that control logic blocks, IO blocks and
interconnection in the FPGA.
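
As a rough illustration of the bitstream-to-bitmap relationship described above, the following Python sketch arranges a flat bitstream into rows that mirror a grid of configuration elements. The function name, the grid width and the row-major ordering are assumptions made for illustration only; an actual device defines its own layout.

```python
def bitstream_to_bitmap(bitstream, columns):
    """Arrange a flat list of configuration bits into rows of `columns` bits.

    Row-major ordering is an assumption; a real device defines its own layout.
    """
    if columns <= 0 or len(bitstream) % columns != 0:
        raise ValueError("bitstream length must be a positive multiple of columns")
    if any(bit not in (0, 1) for bit in bitstream):
        raise ValueError("bitstream must contain only 0 and 1")
    # Each row corresponds to one row of configuration elements in the image.
    return [bitstream[i:i + columns] for i in range(0, len(bitstream), columns)]


# Hypothetical 8-bit bitstream mapped onto a 2 x 4 grid of configuration elements.
bits = [1, 0, 0, 1, 1, 1, 0, 0]
bitmap = bitstream_to_bitmap(bits, columns=4)
```

Each entry of the resulting bitmap stands for the ON/OFF state of one configurable switch.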

At this point, the configuration bitstream either may be downloaded to the logic
array, thereby configuring the device, or the bitstream may be saved onto disk. If the
FPGA contains non-volatile memory elements, a programmer is used to program the
bitmap into the device. Some non-volatile memory (NVM) elements such as EEPROM
and Flash lend themselves to in-system programmability (ISP), allowing programming inside the
design board via JTAG instructions. SRAM-based FPGAs allow ISP, but need NVM
content outside the device to hold the bitstream.

Even though the design has been fine tuned in software for timing and
functionality, it still needs to be verified on silicon. This is due to inaccuracies between
the timing model and silicon performance. Having a pre-fabricated generic FPGA makes
this verification simple and quick. The FPGA device is then programmed and tested in a
system board to verify operational correctness. If the design does not work, it is
re-optimized to work on silicon. When the design works, it is initially fielded. Should the
device prove popular, the FPGA can be converted into an ASIC by hard-coding the
bitstream.

First an image file is generated for all the B contacts that exist between Module
layer-2 and Module layer-1 in Fig. 4. These B contacts represent configuration element
control of the logic blocks. There is a one to one matching between these B contacts and 
the bitmap generated for the design, as every configuration element is represented in 
both. We can define (1,0) in the bitstream to represent SRAM output at logic (1,0) 
respectively. Bitstream ones represent B contacts at Vcc, while bitstream zeros represent 
B contacts at Vss. The bitstream can be automatically mapped to contact B file to convert 
those to Vcc and Vss hard connections. The contacts B are in the CAD database that 
generates the physical mask for wafer processing. This technique provides an error free 
software conversion of the bitstream to a hard-wire mask. By appropriate pre-allocation of
Vcc and Vss resources above the B contacts, one could conceivably generate the ASIC 
with only one custom mask, a considerable savings in expensive mask costs. All the C 
contacts in the hard mask are simply omitted as no configuration elements exist, while all 
the A contacts are retained. 
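
The contact conversion described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the `(kind, bit_index)` contact records, the function name and the output labels are hypothetical and do not represent an actual CAD database format.

```python
def convert_contacts(contacts, bitstream):
    """Derive hard-wire mask entries from FPGA contacts and a bitstream.

    contacts  -- list of (kind, bit_index) tuples, kind being "A", "B" or "C";
                 bit_index is used only for B contacts (hypothetical format).
    bitstream -- list of 0/1 values, one per configuration element.
    """
    mask = []
    for kind, bit_index in contacts:
        if kind == "A":
            # Through connections are retained unchanged.
            mask.append(("A", "through"))
        elif kind == "B":
            # Bitstream one -> contact tied to Vcc, zero -> contact tied to Vss.
            mask.append(("B", "Vcc" if bitstream[bit_index] == 1 else "Vss"))
        elif kind == "C":
            # No configuration elements remain, so C contacts are omitted.
            continue
        else:
            raise ValueError(f"unknown contact kind: {kind!r}")
    return mask


# Hypothetical example: two configuration elements storing 1 and 0.
example_contacts = [("A", None), ("B", 0), ("C", None), ("B", 1)]
hard_mask = convert_contacts(example_contacts, [1, 0])
```

Because each B contact is resolved directly from its bit, the mapping needs no new placement or routing step.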

The conversion does not result in a new placement and routing configuration that
is different from the previous FPGA design. The conversion does not result in a change to
the logic gates in module layer-1 or the lateral wire routing in module layer-3. The
vertical contact height change is negligible in the gate and wire delay components of
logic propagation. Logic gate timing is not affected by control options between SRAM
output or Vcc/Vss. The timing is maintained identical in this FPGA to ASIC conversion.
Furthermore, this conversion can be made by the FPGA supplier, with no engineering
overhead, saving valuable design resources at both end user and manufacturing sites. The
final hard mask ASIC has no soft errors (no SRAM bits to flip), has better reliability as fewer
processing steps and fewer hard wires (one connection to replace 6 transistors) are used,
and provides a secure environment against "bitstream piracy", a technique of stealing
designs by extracting the bitstream from FPGAs.

Next, the circuit blocks 104, the configuration circuit 108, and the
wiring and/or routing circuit 112 in Fig. 1 are detailed.

A variety of digital or analog circuits can be used in circuit blocks 104. These
circuit blocks include programmable logic blocks to allow user customization of logic. In
one embodiment, programmable logic blocks are provided to respond to input data
signals. The programmable logic blocks develop corresponding complete or partial
output logic signals. Registers are used to store the output logic signals and either
output them to output terminals or return them as inputs to additional
programmable logic blocks. The registers themselves can be programmable, allowing
them to be configured as T flip-flops, JK flip-flops, or any other register. The logic
blocks may contain no registers, or the registers may be programmed to be by-passed to
facilitate combinational logic implementation. The programmable logic block can be
selected from one of a pass gate logic, a multiplexer logic, a truth table logic, or an
AND/OR logic. Fig. 5 shows an exemplary AND-OR PLD Architecture. AND and OR
arrays 202 and 204 contain user configurable programmable elements. Fig. 6 shows an
exemplary AND-OR array gate realization of a three input, four P-term, four output PLD.
The AND and the OR arrays 210-212 are shown programmed to a specific pattern.
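
To make the AND-OR array concrete, the following Python sketch evaluates a three input, four P-term, four output array. The programmed pattern below is hypothetical and is not the pattern of Fig. 6; the encoding of the AND array (1 for the true literal, 0 for the complemented literal, None for no connection) is likewise an assumption chosen for illustration.

```python
def eval_pld(inputs, and_array, or_array):
    """Evaluate an AND-OR programmable array.

    inputs    -- tuple of 0/1 input values.
    and_array -- one row per P-term; each entry is 1 (true literal),
                 0 (complemented literal) or None (input not connected).
    or_array  -- one row per output, listing the P-term indices that are ORed.
    """
    pterms = []
    for row in and_array:
        term = True
        for value, literal in zip(inputs, row):
            if literal is None:
                continue  # this input does not take part in the product term
            term = term and (value == literal)
        pterms.append(term)
    return [int(any(pterms[i] for i in row)) for row in or_array]


# Hypothetical programming of a 3-input, 4 P-term, 4-output array.
AND_ARRAY = [(1, 1, None), (0, None, 1), (None, 0, 0), (1, 0, 1)]
OR_ARRAY = [[0], [0, 1], [2], [1, 3]]

outputs = eval_pld((1, 1, 0), AND_ARRAY, OR_ARRAY)
```

In the two selectable configurations, the entries of `AND_ARRAY` and `OR_ARRAY` would correspond either to stored memory bits or to fixed connections.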

In yet other embodiments, the circuit block 104 contains a RAM/ROM logic
block consisting of "logic element tree" or "P-Term logic array" blocks that perform
logic functions. Fig. 7 shows one such NAND EEPROM implementation of a P-Term in
NAND-NOR logic array, while Fig. 8 shows the same P-term configuration for either
SRAM or hard-wired PLD architectures. Fig. 7 shows two mirrored outputs P1 and P2.
For output P1, an AND gate 232 receives signals from pass transistors 222, 224, 228 and
230. The pass transistor 222 is controlled by block 220 shown in the dashed circle, while
the pass transistor 228 is controlled by block 226 shown inside the dashed circle.
Similarly, the upper half of Fig. 8 includes an AND gate 252 that receives inputs from
pass transistors 242, 244, 248 and 250, respectively.

Fig. 9 shows an exemplary pass-gate logic 260 connecting one input to one
output. The NMOS pass gate voltage level S0 determines an ON and OFF connection.
Fig. 10 shows an exemplary 4-Input logic MUX implementing an output function O
where O = I0*S0+I1*S1+I2*S2+I3*S3. In the MUX, only one of S0 270, S1 272, S2
274, S3 276 has a logic one. The MUX is constructed by combining four NMOS pass
gate logic elements 280-286 shown in Fig. 9.
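
The MUX output function O = I0*S0+I1*S1+I2*S2+I3*S3 can be checked with a short Python sketch. The function name is hypothetical, and the one-hot check simply enforces the stated condition that only one select signal is at logic one.

```python
def mux4(inputs, selects):
    """Compute O = I0*S0 + I1*S1 + I2*S2 + I3*S3 for one-hot select signals."""
    if len(inputs) != 4 or len(selects) != 4:
        raise ValueError("mux4 expects four inputs and four select signals")
    if any(s not in (0, 1) for s in selects) or sum(selects) != 1:
        raise ValueError("exactly one select signal must be at logic one")
    out = 0
    for i, s in zip(inputs, selects):
        out |= i & s  # each pass gate conducts only when its select is 1
    return out


# S1 is the active select, so the output follows I1.
o = mux4([0, 1, 0, 1], [0, 1, 0, 0])
```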

Fig. 11 shows an exemplary 2-input truth table logic realization of an output
function F where,

F = /A*/B*S0+/A*B*S1+A*/B*S2+A*B*S3 (/A means not A).

The truth table logic states are represented by S0, S1, S2 and S3. The realization is done
through six inverters collectively designated 250 and eight pass transistors collectively
designated 260. Logic states are stored in 4 programmable registers.
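
As a worked example of the function F above, the following Python sketch evaluates the 2-input truth table for a given set of logic states. The function name and the example state values (chosen to realize XOR and AND) are illustrative assumptions.

```python
def truth_table2(a, b, s):
    """F = /A*/B*S0 + /A*B*S1 + A*/B*S2 + A*B*S3, with s = [S0, S1, S2, S3]."""
    na, nb = 1 - a, 1 - b  # complemented inputs /A and /B
    return (na & nb & s[0]) | (na & b & s[1]) | (a & nb & s[2]) | (a & b & s[3])


# Storing 0, 1, 1, 0 in the four states realizes XOR;
# storing 0, 0, 0, 1 realizes AND.
XOR_STATES = [0, 1, 1, 0]
AND_STATES = [0, 0, 0, 1]

f = truth_table2(1, 0, XOR_STATES)
```

Changing only the four stored states changes the logic function, while the inverters and pass transistors stay the same.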

Fig. 12 shows a logic tree constructed with five 2-input truth table logic blocks
320-328 to perform a full four input truth table. A four input truth table has 16 possible
logic states S0, S1, ..., S15. As the number of inputs grows to N, this logic tree
construction requires 2^N logic states, and 2^(N-1) branches in the logic tree. For large N
values, a full truth table realization is less efficient compared to a partial product term
AND-OR array realization.
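
One way to picture the logic tree is the following Python sketch, in which four 2-input truth table blocks each hold four of the 16 logic states and a fifth block selects among their outputs. The assignment of inputs to tree levels and the ordering of the states are assumptions for illustration and may differ from Fig. 12.

```python
def lut2(a, b, s):
    """2-input truth table: returns S0, S1, S2 or S3 for (a, b) = 00, 01, 10, 11."""
    return s[2 * a + b]


def lut4(a, b, c, d, states):
    """4-input truth table built from five 2-input truth table blocks.

    states -- 16 logic states; state index = 4*(2*c + d) + (2*a + b) (assumed order).
    """
    if len(states) != 16:
        raise ValueError("a four input truth table needs 2**4 = 16 logic states")
    # First level: four blocks, each holding four of the sixteen states.
    first_level = [lut2(a, b, states[4 * k:4 * k + 4]) for k in range(4)]
    # Second level: the fifth block selects one first-level output using c and d.
    return lut2(c, d, first_level)


# Hypothetical contents: 4-input parity (XOR of all four inputs).
PARITY_STATES = [bin(i).count("1") % 2 for i in range(16)]

y = lut4(1, 0, 1, 1, PARITY_STATES)
```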

In another embodiment, the programmable logic block can be a programmable
microprocessor block. The microprocessor can be selected from third party IP cores such
as: 8051, Z80, 68000, MIPS, ARM, and PowerPC. These microprocessor architectures
include superscalar, Fine Grain Multi-Threading (FGMT) and Simultaneous
Multi-Threading (SMT) that support Application Specific Packet Processing (ASPP) routines.
To handle Programmable Network Interface (PNI) the processor can contain hardware
and software configurability. Hardware upgradeability can be greatly enhanced in
microprocessors embedded in PLDs by making use of the available logic content of the
PLD device. Programmable features can include varying processor speed, cache memory
system and processor configuration, enhancing the degree of Instruction Level
Parallelism (ILP), and enhancing Thread Level Parallelism (TLP). Such enhancements allow
the user to optimize the core processor to their specific application. Cache parameters
such as access latency, memory bandwidth, interleaving and partitioning are also
programmable to further optimize processor performance and minimize cache miss
rates. Additionally, the processor block can be a Very Long Instruction Word (VLIW)
processor to handle multimedia applications. The processor block can include a cache
controller to implement a large capacity cache as compared with an internal cache.

While a PLD can be configured to do DSP functions, the programmable logic 
block can also contain a digital signal processor (DSP), which is a special purpose 
processor designed to optimize performance for very high speed digital signal processing 
encountered in wireless and fiber-optic networks. The DSP applications can include 
programmable content for cache partitioning, digital filters, image processing and speech 
recognition blocks. These real-time DSP applications contain high interrupt rates and 
intensive numeric computations best handled by hardware blocks. In addition, the 
applications tend to be intensive in memory access operations, which may require the 
input and output of large quantities of data. The DSP cache memory may be configured 
to have a "Harvard" architecture with separate, independent program and data memories 
so that the two memories may be accessed simultaneously. This architecture permits an 
instruction and an operand to be fetched from memory in a single clock cycle. A 
modified Harvard architecture utilizes the program memory for storing both instructions 
and operands to achieve full memory utilization. The program and data memories are 
often interconnected with the core processor by separate program and data buses. When 
both instructions and operands (data) are stored in a single program memory, conflicts 
may arise in fetching data with the next instruction. Such conflicts have been resolved in 
prior art for DSPs by providing an instruction cache to store conflicting instructions for 
subsequent program execution.

In yet another embodiment, the programmable logic block can contain software
programmability. These software functions are executed in DSP, ARM, or MIPS type
inserted IP cores, or an external host CPU. Accelerators connected by a configurable
SRAM switching matrix enhance the computation power of the processors. The
microprocessor has local permanent SRAM memory to swap, read, and write data. The
switch matrix is pre-designed to offer both hard-wire and programmable options in the
final ASIC. In this situation, the circuit block 104 can be a functional block that performs
a well-defined, commonly-needed function, such as a special D/A or A/D converter, a
standard bus interface, or a block that implements special algorithms such as MPEG decode.
The special algorithms implemented can be hardware versions of software. For example,
algorithms relating to digital radio or cellular telephone such as WCDMA signal
processing can be implemented by the functional block. Other functional blocks include
PCI, mini-PCI, USB, UART blocks that can be configured by specifying the SRAM logic
blocks.

In yet another embodiment, the circuit block 104 can be memory such as a
register file, cache memory, static memory, or dynamic memory. A register file is an
array of latches that operate at high speed. This register length counter may be
programmable by the user. A cache memory has a high access throughput, short access
latency and a smaller capacity as compared with main memory. The cache memory may
be programmable to partition between the different requirements of the system design.
One such need is the division between L1 and L2 cache requirements for networking
applications. The memory can also be a static random access memory (SRAM) device
with an array of single port, or multi-port addressable memory cells. Each cell includes a
four transistor flip-flop and access transistors that are coupled to input/output nodes of
the flip-flop. Data is written to the memory cell by applying a high or low logic level to
one of the input/output nodes of the flip-flop through one of the access transistors. When
the logic level is removed from the access transistor, the flip-flop retains this logic level
at the input/output node. Data is read out from the flip-flop by turning on the access
transistor. The memory can also be dynamic random access memory (DRAM).
Generally, a DRAM cell consists of one transistor and a capacitor. A word line turns
on/off the transistor at the time of reading/writing data stored in the capacitor, and the bit
line is a data input/output path. DRAM data is destroyed during read, and refresh
circuitry is used to continually refresh the data. Due to the low component count per bit, a
high density memory device is achieved.
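
The destructive read of the DRAM cell described above can be illustrated with a
small behavioral model. This is a deliberately simplified sketch: the class and
function names are hypothetical, charge sharing with the bit line is reduced to
"the read empties the capacitor", and the write-back merely stands in for the
refresh circuitry.

```python
class DramCell:
    """One-transistor, one-capacitor cell (behavioral model only)."""

    def __init__(self):
        self.charge = 0  # logic value held on the capacitor

    def write(self, word_line, bit_line):
        if word_line:  # transistor on: the capacitor follows the bit line
            self.charge = bit_line

    def read(self, word_line):
        if not word_line:
            return None  # transistor off: the cell is isolated from the bit line
        value = self.charge
        self.charge = 0  # simplification: reading drains the stored charge
        return value


def read_with_restore(cell):
    """Read the cell, then write the sensed value back so the data survives."""
    value = cell.read(word_line=1)
    cell.write(word_line=1, bit_line=value)
    return value
```

A plain read of a cell holding a one returns one and leaves the cell at zero,
whereas read_with_restore returns the same value on repeated calls.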

In another embodiment, the circuit block 104 can be an intellectual property
("IP") core which is reusable for licensing from other companies or which is taken from
the same/previous design. In core-based design, individual cores may be developed and
verified independently as stand-alone modules, particularly when the IP core is licensed
from an external design source. These functions are provided to the user as IP blocks as
special hardware blocks or pre-configured programmable logic blocks. The IP blocks connect
via a programmable switching matrix to each other and other programmable logic. The
hardware logic block insertion to any position in a logic sequence is done through the
configurable logic matrix. These hardware logic blocks offer a significant gate count
reduction on high gate count frequently used logic functions, and the user does not
require generic "logic element" customization. In both cases, the user saves simulation
time, minimizes logic gate count, improves performance, reduces power consumption and
reduces product cost with pre-defined IP blocks. The switch matrix is replaced by
hard-wires in the final ASIC.

The circuit blocks 104 can also be an array of programmable analog blocks. In 
one embodiment, the analog blocks include programmable PLL, DLL, ADC and DAC. In 
another embodiment, each block contains an operational amplifier, multiple 
programmable capacitors, and switching arrangements for connecting the capacitors in 
such a way as to perform the desired function. Switched capacitor filters can also be
used to achieve an accurate filter specification through a ratio of capacitors and an 
accurate control of the frequency of a sampling clock. Multiple PLL's can be 
programmed to run at different frequencies on the same chip to facilitate SoC 
applications requiring more than one clock frequency. 

The circuit blocks 104 also contain data fetch and data write circuitry required to
configure the configuration circuits 108. This operation may be executed by a host CPU
residing in the system, or the PLD device itself. During power up, these circuits initialize
and read the configuration data from an outside source, either in serial mode or in parallel
mode. The data is stored in a predefined word length locally and written to the
configurability allocation. The programmed configuration data is verified against the
locally stored data and a programming error flag is generated if there is a mismatch.
These circuits are redundant in the conversion of the PLD to an ASIC. However, these
circuits are used in both FPGA and ASIC for test purposes, and have no cost penalty. A
pin-out option has a "disable" feature to disconnect them for the customer use in the
FPGA and ASIC.

Configuration circuits 108 provide active circuit control over digital circuits 104.
One embodiment of the configuration circuit includes an array of memory elements. The
user configuration of this memory amounts to a specific bitmap of the programmable
memory in a software representation.
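
The fetch, write and verify sequence described above for loading the
configuration bitmap can be sketched as follows. This is an illustrative model
only: the ConfigMemory class, its stuck_at_zero fault option, and the function
load_configuration are hypothetical names, and the word length is a free parameter.

```python
class ConfigMemory:
    """Toy configuration memory; stuck_at_zero=(address, bit) injects a fault."""

    def __init__(self, stuck_at_zero=None):
        self.words = {}
        self.stuck_at_zero = stuck_at_zero

    def write_word(self, address, word):
        stored = list(word)
        if self.stuck_at_zero and self.stuck_at_zero[0] == address:
            stored[self.stuck_at_zero[1]] = 0  # faulty cell cannot hold a one
        self.words[address] = stored

    def read_word(self, address):
        return list(self.words[address])


def load_configuration(source_bits, word_length, memory):
    """Write the bitstream word by word; return True if any word mismatches."""
    error_flag = False
    for address, start in enumerate(range(0, len(source_bits), word_length)):
        local_word = list(source_bits[start:start + word_length])  # local copy
        memory.write_word(address, local_word)
        if memory.read_word(address) != local_word:  # verify against local copy
            error_flag = True
    return error_flag
```

With a fault-free memory the function returns False, and with a fault injected
into a bit that should hold a one it returns True, mirroring the programming
error flag.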

Suitable memory elements include volatile or non volatile memory elements. In 
non-volatile memory (NVM) based products, configurable data is held in one of metal
link fuse, anti-fuse, EPROM, Flash, EEPROM, or ferro-electric memory
elements. The first two are one time programmable (OTP), while the last four can be
programmed multiple times. As EPROMs require UV light to erase data, only Flash and
EEPROM lend themselves to in-system programmability (ISP). In volatile products, the
configurable data storage can be SRAM cells or DRAM cells. With DRAM cells, the 
data requires constant refresh to prevent losses from leakages. Additionally, one or more 
redundant memory cells controlling the same circuit block can be used to enhance device 
yield. 

The components of the memory element array can be a resistor, capacitor, 
transistor or a diode. In another embodiment of the configuration circuit, a memory 
element can be formed using thin film deposition. The memory element can be a thin 
film resistor, thin film capacitor, thin film transistor (TFT) or a thin film diode or a group 
of thin film devices connected to form an SRAM cell. 

This discussion is mostly on SRAM elements and can easily extend to include all 
other programmable elements. In all cases, the design needs to adhere to rules that allow 
programmable module elimination, with no changes to the base die, a concept not used in 
PLD, FPGA, Gate Array and ASIC products today.

An exemplary 6T SRAM cell, shown in Fig. 13, needs no high voltage capability,
nor added process complexity. The cell of Fig. 13 has two back-to-back inverters
350-352 whose access is controlled by pass transistors 354-356. In addition, R-load & Thin
Film Transistor (TFT) load PMOS based SRAM cells can be used for PLDs and FPGAs.
To achieve zero stand-by power by eliminating sensing circuitry, and reduce memory
element count for low input functions, these SRAM cells are embedded in truth table
logic (also called Look-Up-Table) based architectures.

Pass gate transistor 360 logic controlled by SRAM is shown in Fig. 14. In this
embodiment, the memory cell (such as the cell of Fig. 13) drives the pass transistor 360
to affect an outcome. A 5x6-switch point matrix 370 controlled by 30-SRAM cells
coupled to 30-NMOS pass gates is shown in Fig. 15. Fig. 16 shows the NMOS pass gate
360 logic controlled by the SRAM in Fig. 14 converted to hard-wire logic. A contact 362,
connected to Vcc (logic 1) or Vss (logic 0) depending on the SRAM logic content,
replaces the SRAM cell. The SRAM logic mapping to hard wire connections is
automatic and done by a software program that is verifiable against the bit-map.

Similarly, Fig. 17 shows the 5x6-switch point matrix 370 hard-wired by replacing
the SRAM bits that control NMOS gates with hard-wires to Vcc or Vss. In Fig. 17, the
bubble may represent either SRAM or hard-wire Vcc or Vss control on NMOS pass
gates. In the case of Fuse or Antifuse arrays, contact or no contact between the two metal
lines in Fig. 15 directly replaces the programmable element and there is no NMOS
pass-gate needed.
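
The SRAM-to-hard-wire mapping described above can be sketched in a few lines.
The function names and the string labels for the two supply ties are
assumptions for illustration; the actual conversion software is not described
in the text.

```python
VCC, VSS = "Vcc", "Vss"


def hardwire_from_bitmap(bitmap):
    """Replace each SRAM bit by its contact: logic 1 -> Vcc, logic 0 -> Vss."""
    return [[VCC if bit else VSS for bit in row] for row in bitmap]


def verify_against_bitmap(hardwire, bitmap):
    """Recover the bits implied by the hard-wire ties and compare to the bitmap."""
    recovered = [[1 if tie == VCC else 0 for tie in row] for row in hardwire]
    expected = [[1 if bit else 0 for bit in row] for row in bitmap]
    return recovered == expected
```

For a 5x6 switch point matrix the bitmap is simply five rows of six bits, and
verify_against_bitmap(hardwire_from_bitmap(bitmap), bitmap) is expected to hold
for any such bitmap.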

The P-Term logic builds the core of PLD's and complex PLD's (CPLD's) that use
AND-OR blocks 202-204 (or equivalent NAND-NOR type logic functions) as shown in
the block diagram of Fig. 5 and one expansion is shown in Fig. 6 with AND gates 210 and
OR gates 212. Gate implementation of two inputs (I1, I2) and two P-terms (P1, P2) NAND
function can be single poly EEPROM bits as shown in Fig. 10. The dotted circle contains
the charge trapping floating gate, the programming select transistor, tunneling diode, a
control gate capacitor and programming access nodes. The SRAM cell replaces that
entire circle in this invention as detailed next. The SRAM NAND-NOR array (also
AND-OR array) replacement has not been realized in prior art as SRAM cells require Nwell &
Pwell regions that consume large silicon area to prevent latch-up. The SRAM in TFT does
not have well related constraints as NMOS and PMOS bodies are isolated from each
other. Keeping the two pass gates in silicon layers and moving SRAM to TFT layers
allows P-Term logic implementation with SRAM cells and subsequent replacement with
hard-wires. In TFT SRAM conversion to final ASIC, the bubble on NMOS gate becomes
a hard-wire connection to Vcc or Vss.
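
A generic product term (AND-OR) array can be sketched as below to show how
configuration bits select which inputs enter each P-term. The bit encoding, one
(true, complement) pair per input per term, is an assumption chosen for
clarity and is not taken from the figures.

```python
def evaluate_pterm_array(inputs, and_plane, or_plane):
    """Evaluate a programmable AND-OR array.

    and_plane[p] holds one (use_true, use_complement) bit pair per input.
    or_plane[o][p] is 1 when product term p feeds output o.
    """
    pterms = []
    for row in and_plane:
        term = 1
        for value, (use_true, use_complement) in zip(inputs, row):
            if use_true:
                term &= value
            if use_complement:
                term &= 1 - value
        pterms.append(term)
    return [int(any(sel and p for sel, p in zip(row, pterms))) for row in or_plane]


# Hypothetical programming for two inputs (I1, I2) and two P-terms (P1, P2):
# P1 = I1*/I2, P2 = /I1*I2, and the single output is P1 + P2 (exclusive OR).
XOR_AND_PLANE = [[(1, 0), (0, 1)], [(0, 1), (1, 0)]]
XOR_OR_PLANE = [[1, 1]]
```

In the converted ASIC each of these bits would become a fixed tie to Vcc or Vss,
so the same evaluation applies with the planes treated as constants.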

The length of input and output wires, and the drive on NMOS pass gates and logic
gate delays determine the overall PLD delay timing, independent of the SRAM cell
parameters. By moving the SRAM cell to TFT upper layers, the chip X,Y dimensions are
reduced by 20% to 50% compared to traditional SRAM FPGA's, providing a faster
logic evaluation time. In addition, removal of the SRAM cell later does not alter lateral wire
length, wire loading and NMOS pass gate characteristic. The vertical dimension change
in eliminating the memory module is negligible compared to the lateral dimension of the
ASIC, and has no impact on timing. This allows maintaining identical timing between the
FPGA and ASIC implementations with and without the SRAM cells. The final ASIC
with smaller die size and no SRAM elements has superior reliability, similar to an
ASIC, leading to lower board level burn-in and field failures compared to PLD's and
FPGA's in use today.

Next, the wiring and/or routing circuit 112 is discussed. The wiring and/or
routing circuit connects each logic block to each other logic block. The wiring/routing
circuit allows a high degree of routing flexibility per silicon area consumed and
uniformly fast propagation of signals, including high-fanout signals, throughout the
device. The wiring module may contain one or many levels of metal interconnects.

One embodiment of a switch matrix is a 6x5 programmable switch-matrix with 30
SRAM bits (or 30 Anti-fuses, or 30 fuses), shown in Fig. 15. The box in Fig. 14 contains
the SRAM cell shown inside dotted box of Fig. 14, where the pass gate makes the
connection between the two wires, and the SRAM bit holds the configuration data. In this
configuration, the wire connection in circuit 112 occurs via a pass transistor located in
circuit 104 controlled by an SRAM cell in circuit 108. During power-up, a permanent
non-volatile memory block located in the system loads the correct configuration data
into SRAM cells. In Fuse or Anti-fuse applications, the box simply represents the
programmable element in circuit 108 between the two wires in circuit 112. During the
ASIC conversion this link is replaced with an open or short between the wires.
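
The behavior of the 6x5 switch matrix can be sketched as a lookup over its 30
configuration bits. The model below assumes, for illustration only, that the six
row wires drive the five column wires; real pass gates are bidirectional, and
the function name route is hypothetical.

```python
ROWS, COLS = 6, 5


def route(bits, row_signals):
    """Return the value seen on each column wire for a given bit matrix.

    bits[r][c] == 1 means the switch point joining row r and column c is closed.
    An unconnected column yields None.
    """
    if len(bits) != ROWS or any(len(row) != COLS for row in bits):
        raise ValueError("expected a 6x5 matrix (30 configuration bits)")
    if len(row_signals) != ROWS:
        raise ValueError("expected one signal per row wire")
    outputs = []
    for col in range(COLS):
        drivers = [row for row in range(ROWS) if bits[row][col]]
        if len(drivers) > 1:
            raise ValueError(f"column {col} is shorted to rows {drivers}")
        outputs.append(row_signals[drivers[0]] if drivers else None)
    return outputs
```

The same bit matrix describes both cases in the text: held in SRAM for the
programmable device, or frozen as opens and shorts after the ASIC conversion.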

Another embodiment provides short interconnect segments that could be joined to
each other and to input and output terminals of the logic blocks at programmable
interconnection points. In another embodiment, direct connections to adjacent logic
blocks can be used to increase speed. For global signals that traverse long distances,
longer lines are used. Segmented interconnect structures with routing lines of varied
lengths can be used. In yet other embodiments, a hierarchical interconnect structure
provides lines of short lengths connectable at boundaries to lines of longer lengths
extending between the boundaries, and larger boundaries with lines of even longer length
extending between those boundaries. The routing circuit can connect adjacent logic
blocks in two different hierarchical blocks differently than adjacent logic blocks in the
same hierarchical block. Alternatively, a tile-based interconnect structure with lines of
varying lengths can be used, in which each tile in a rectangular array may be identical
to each other tile. In yet another implementation, the interconnect lines can be separated
from the logic block inputs by way of a routing matrix, which gives each interconnect
line more flexible access to the logic block inputs. In another embodiment, the
interconnect routing is driven by programmable buffers. Long wire lengths can be
sub-divided into smaller length segments with smaller buffers to achieve a net reduction in
the overall wire delay, and to obtain predictable timing in the logic routing of the PLD.
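
The benefit of sub-dividing a long wire can be illustrated with the standard
first-order (Elmore) estimate for a distributed RC line, in which delay grows
with the square of the length. The sketch below is an approximation under
stated assumptions: each buffer is modeled as a fixed delay, buffer drive
resistance and input capacitance are ignored, and all units and parameter
values are arbitrary.

```python
def wire_delay(length, r_per_unit, c_per_unit):
    """Elmore delay of an unbuffered distributed RC wire: 0.5 * r * c * L^2."""
    return 0.5 * r_per_unit * c_per_unit * length ** 2


def segmented_delay(length, segments, r_per_unit, c_per_unit, buffer_delay):
    """Delay of the same wire split into equal segments, one buffer per segment."""
    if segments < 1:
        raise ValueError("at least one segment is required")
    per_segment = wire_delay(length / segments, r_per_unit, c_per_unit)
    return segments * (per_segment + buffer_delay)
```

With unit resistance and capacitance, a wire of length 10 has an estimated
delay of 50, while five segments with a buffer delay of 2 each give 20. The
advantage disappears when the buffer delay dominates, so the segment count is a
trade-off rather than a case of more always being better.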

Next, the manufacturing process is briefly described. During
manufacturing, one or more digital circuits can be formed on a substrate. Next, the
process selectively fabricates either a memory circuit or a conductive pattern
substantially above the digital circuits to control a portion of the digital circuits. Finally, the
process fabricates an interconnect and routing layer substantially above the digital
circuits and memory circuits to connect the digital circuits and one of the memory circuit or
the conductive pattern.

The process can be modified to fabricate a generic field programmable gate array
(FPGA) with the constructed memory circuit or an application specific integrated circuit
(ASIC) with the constructed conductive pattern. Multiple ASICs can be fabricated with
different variations of conductive patterns. The memory circuit and the conductive
pattern have one or more substantially matching circuit characteristics. In this case,
timing characteristics are substantially unchanged by the circuit control option. The process
thus fabricates a programmable logic device by constructing digital circuits on a
substrate; and constructing a non-planar circuit on the substrate after constructing the
digital circuits, the non-planar circuit being either a memory deposited to store data to
configure the digital circuits to form a field programmable gate array (FPGA) or a
conductive pattern deposited to hard-wire the digital circuits to form an application
specific integrated circuit (ASIC), wherein the deposited memory and the conductive
pattern have substantially matching timing characteristics. In another embodiment, the
hard-wire ASIC option may be incorporated into the digital circuit layer 100. In another
embodiment, the hard-wire ASIC option is incorporated into the routing layer 110.

Although an illustrative embodiment of the present invention, and various 
modifications thereof, have been described in detail herein with reference to the 
accompanying drawings, it is to be understood that the invention is not limited to this 
precise embodiment and the described modifications, and that various changes and 
further modifications may be effected therein by one skilled in the art without departing 
from the scope or spirit of the invention as defined in the appended claims.